MDEV-18386: Add server_audit_timestamp_format to customize audit log timestamps#4633
MDEV-18386: Add server_audit_timestamp_format to customize audit log timestamps#4633abhishek593 wants to merge 1 commit intoMariaDB:mainfrom
Conversation
gkodinov
left a comment
There was a problem hiding this comment.
This is a preliminary review. Thank you for your contribution!
gkodinov
left a comment
There was a problem hiding this comment.
it was a good first iteration. Thanks for working on that. Another round of comments coming up.
gkodinov
left a comment
There was a problem hiding this comment.
LGTM after fixing the below. Please expect a final review.
plugin/server_audit/server_audit.cc
Outdated
| NULL, NULL, 1024, 0, 0x7FFFFFFF, 1); | ||
| static MYSQL_SYSVAR_STR(timestamp_format, timestamp_format, | ||
| PLUGIN_VAR_RQCMDARG, | ||
| "The format string the strftime() routine applies with its format argument", |
There was a problem hiding this comment.
I'd say: a format string used to print the timestamp into the audit log messages. The format used is the same as strftime. Default is empty, meaning use the existing hard-coded format.
plugin/server_audit/server_audit.cc
Outdated
| if (timestamp_format_buffer[0]) | ||
| ts_len= strftime(ts_tmp, sizeof(ts_tmp), timestamp_format_buffer, &tm_time); | ||
|
|
||
| if (ts_len == 0) |
There was a problem hiding this comment.
just do "else" here. Easier to read.
plugin/server_audit/server_audit.cc
Outdated
| { | ||
| if (!(is_active= (logger_write(logfile, message, len) == (int) len))) | ||
| struct tm tm_time; | ||
| char ts_tmp[TIMESTAMP_OUTPUT_LENGTH]; |
There was a problem hiding this comment.
256 bytes in the stack ... this is a bit much. I'd allocate a buffer outside of the lock and pass it down. Or, better, it's going at the start of message anyway, can't you just print it there directly instead of using a temp buffer and then memcpy() into it?
There was a problem hiding this comment.
Yeah, so here's what I changed. Removed the temp buffer, and directly stored the output of strftime in message buffer at the beginning, then used memmove to shift the output from the start to the end of reserved space.
plugin/server_audit/server_audit.cc
Outdated
| /** | ||
| Write to the log, acquiring the lock. | ||
| */ | ||
| */ |
There was a problem hiding this comment.
don't do white-space only changes please!
plugin/server_audit/server_audit.cc
Outdated
| const char *userip, unsigned int userip_len, | ||
| unsigned int connection_id, unsigned int port, | ||
| long long query_id, const char *operation) | ||
| const char *serverhost, size_t serverhost_len, |
There was a problem hiding this comment.
no white-space only changes please
plugin/server_audit/server_audit.cc
Outdated
| time_t ctime; | ||
| size_t csize; | ||
| char message[1024]; | ||
| char raw_message[1024 + TIMESTAMP_OUTPUT_LENGTH]; |
There was a problem hiding this comment.
I'd use some named const size_t for the 1k constant
plugin/server_audit/server_audit.cc
Outdated
|
|
||
| (void) time(&ctime); | ||
| csize= log_header(message, sizeof(message)-1, &ctime, | ||
| csize= log_header(message, 1024-1, |
There was a problem hiding this comment.
why do you need to do this? Can't you just pass the actual length as you used to do? And then do the calculation inside this func?
There was a problem hiding this comment.
Previously message was a fixed size array (1024). In the new version, its a pointer to an offset within a large buffer, so we must explicitly pass the correct size (MAX_AUDIT_PAYLOAD_LENGTH).
plugin/server_audit/server_audit.cc
Outdated
| time_t ctime; | ||
| size_t csize; | ||
| char message[1024]; | ||
| char raw_message[1024 + TIMESTAMP_OUTPUT_LENGTH]; |
There was a problem hiding this comment.
same here: use a named constant for 1k
plugin/server_audit/server_audit.cc
Outdated
| timestamp_format); | ||
| CLIENT_ERROR(1, "server_audit_timestamp_format can't exceed %d characters.", | ||
| MYF(ME_WARNING), TIMESTAMP_FORMAT_LENGTH - 1); | ||
| return; |
There was a problem hiding this comment.
validity checks must be done in the check callback, not in the update callbac
plugin/server_audit/server_audit.cc
Outdated
| PLUGIN_VAR_RQCMDARG, | ||
| "A format string used to print the timestamp into the audit log messages " | ||
| "The format used is the same as strftime and the default is empty which uses " | ||
| "the existing hard-coded format", |
There was a problem hiding this comment.
Small issue: don't make it "empty = hard-coded", it's not very intuitive, simply set the default value of timestamp_format to the correct (backward compatible) value.
Big issue: don't use strftime. The server already has date-to-string formatting function DATE_FORMAT(). We should not let different parts of the server to implement formatting differently. For example, %M should always and everywhere have the same meaning.
That is, invoke server's date-to-string function. And to do that, it must be in a service. But, perhaps, you can add a new method to the timezone service and not create a new one.
There was a problem hiding this comment.
I have updated the implementation and added a new method in the timezone service.
vuvova
left a comment
There was a problem hiding this comment.
Looks quite good. The service is almost correct and its usage is correct. %z should print global time zone. And I don't quite understand why you moved time->string conversion away from log_header
| select 5; | ||
| 5 | ||
| 5 | ||
| set global server_audit_timestamp_format='%Y%m%d %H:%i:%s'; |
There was a problem hiding this comment.
you can try set global server_audit_timestamp_format=default instead
| set global server_audit_timestamp_format='CUSTOM-DATE %Y-%m-%d'; | ||
| select 2; | ||
|
|
||
| # Custom format 2: Including timezone (if supported by system strftime) |
There was a problem hiding this comment.
you can now remove references to the "system strftime" from comments
plugin/server_audit/server_audit.cc
Outdated
| #endif | ||
| #ifndef MAX_AUDIT_QUERY_LENGTH | ||
| #define MAX_AUDIT_QUERY_LENGTH 2048 | ||
| #endif |
There was a problem hiding this comment.
let's drop these #ifndef/#endif, simply define what you want. Easier to read and we don't really expect anyone to define this outside of the plugin
plugin/server_audit/server_audit.cc
Outdated
| "%04d%02d%02d %02d:%02d:%02d,%.*s,%.*s,%.*s%s,%d,%lld,%s", | ||
| tm_time.tm_year+1900, tm_time.tm_mon+1, tm_time.tm_mday, | ||
| tm_time.tm_hour, tm_time.tm_min, tm_time.tm_sec, | ||
| ",%.*s,%.*s,%.*s%s,%d,%lld,%s", |
There was a problem hiding this comment.
I thought it'd be natural to generate the timestamp string here and print like
"%s,%.*s,%.*s,%.*s%s,%d,%lld,%s", ts_start, ...
Why did you move it to write_log() ?
There was a problem hiding this comment.
I initially had the implmentation in log_header(). However I had to take a lock there for the timestamp buffer. So, overall it took 2 locks for an audit log. Moving it to write_log(), we will have to take just a single lock (existing one). That would be faster. Quoting gkodinov's comments here:
On your "taking the lock" question.
I'm not very familiar with with the code, but I'll take a stab:
What audit does (from what I can see) is:
char message[...]; prepare_the_message_as_much_as_possible(message); rd_lock(); <the variable protected with the rw lock>->write_log(message); release_rd_lock();The above is done to decrease the time needed to hold on the lock.
Note that this lock is later expanded to also cover all of the config values (in addition to the logger_service variable). Not sure it's a great idea, but I'm not the one to judge.
And now you're adding a need to take the lock again during prepare.
Taking a read lock is usually considered expensive afair. So taking the lock twice is definitely not a good idea IMHO. One option would be to move the strftime() call inside the existing lock boundaries. You know the max length already. So you can probably prepare a buffer for it and concat the two just before sending. This will be a good idea if strftime does take longer and locks more compared to strcpy() and the cost of extra memory.
The other option is to extend the scope of the read lock to cover preparation too. This is easier and the effect is probably less, since the only important write lock acquisitions are during the SET for the plugin variables.
I'd personally advise towards trying the former for like a day or two. But please make sure not to call any allocs while doing that. This will have an even more devastating effect on performance than the lock extension. If it proves too hard I'd go with the latter.
9e88050 to
8c3ac1c
Compare
8c3ac1c to
2435bd2
Compare
sql/item_timefunc.cc
Outdated
| curr_timezone= thd->variables.time_zone; | ||
| curr_timezone->get_timezone_information(&curr_tz, l_time); | ||
| } | ||
| tz->get_timezone_information(&curr_tz, l_time); |
There was a problem hiding this comment.
well, the goal of the old code was to avoid multiple calls of the expensive get_timezone_information().
I think we can keep this logic, it doesn't complicate the code much
There was a problem hiding this comment.
I misunderstood that the current changes are fine. I have now made the necessary changes.
plugin/server_audit/server_audit.cc
Outdated
| } | ||
| else if (output_type == OUTPUT_SYSLOG) | ||
| { | ||
| /* Syslog doesn't need the audit timestamp. Skip the leading comma. */ |
There was a problem hiding this comment.
dunno, looks hackish. let's not add this comma in write_log, not
in log_header. then you won't need to remove it here
There was a problem hiding this comment.
Okay, I have removed the comma from log_header(), and adding it now in write_log().
plugin/server_audit/server_audit.cc
Outdated
| @param lock whether the caller did not acquire lock_operations | ||
| */ | ||
| static int write_log_maybe_lock(const char *message, size_t len, bool lock) | ||
| */ |
2435bd2 to
2fdc052
Compare
vuvova
left a comment
There was a problem hiding this comment.
Just one comment, I'll mark the PR tentatively approved.
But I don't like your string manipulations, how you print the message leaving empty space, then memmove timestamp around. It's difficult to follow and looks hacky.
Thanks to your explanation, I understand why you did it and I don't see how to improve the readability of your solution. So I've asked the original developer of the audit plugin to take a look, in case he'd have a cleaner idea how to achieve the same. Or he can just approve and push.
sql/item_timefunc.cc
Outdated
| curr_timezone= thd->variables.time_zone; | ||
| curr_timezone->get_timezone_information(&curr_tz, l_time); | ||
| } | ||
| tz->get_timezone_information(&curr_tz, l_time); |
2fdc052 to
ddb9dd0
Compare
|
I agree that part looks hacky, but the goal is purely performance. We don't have to create another buffer to pass around. If there's a better way, I'm happy to refactor. |
holyfoot
left a comment
There was a problem hiding this comment.
Ok to push.
I'd stay for the simpler approach to keep the timestamp_format_buffer command-line only. But since the dynamic behaviour is implemented already, i'm fine with this.
|
@abhishek593 , thanks, looks good. This is what happens now: MDEV-18386 is moved to the "In-Testing" status, if everything is fine, a tester will approve it and it'll be pushed into main in time for the 13.0.1 release which is planned for the beginning of May. There is nothing you need to to anymore, unless testing finds bugs. |
This PR adds a new global system variable, server_audit_timestamp_format, to the MariaDB Audit Plugin.
Key changes: